Output-sensitive Skyline Algorithms in External Memory

Authors

  • Xiaocheng Hu
  • Cheng Sheng
  • Yufei Tao
  • Yi Yang
  • Shuigeng Zhou
Abstract

This paper presents new results in external memory for finding the skyline (a.k.a. maxima) of N points in d-dimensional space. The state of the art uses $O((N/B)\log^{d-2}_{M/B}(N/B))$ I/Os for fixed d ≥ 3, and $O((N/B)\log_{M/B}(N/B))$ I/Os for d = 2, where M and B are the sizes (in words) of memory and a disk block, respectively. We give algorithms whose running time depends on the number K of points in the skyline. Specifically, we achieve $O((N/B)\log^{d-2}_{M/B}(K/B))$ expected cost for fixed d ≥ 3, and $O((N/B)\log_{M/B}(K/B))$ worst-case cost for d = 2. As a by-product, we solve two problems of independent interest. The first, the M-skyline problem, aims at reporting M arbitrary skyline points, or the entire skyline if its size is at most M. We settle this problem in O(N/B) expected time in any fixed dimensionality d. The second, the M-pivot problem, is more fundamental: given a set S of N elements drawn from an ordered domain, it outputs M evenly scattered elements (called pivots) from S, meaning that S has asymptotically the same number of elements between each pair of consecutive pivots. We give a deterministic algorithm that solves the problem in O(N/B) I/Os.
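To make the M-pivot output concrete, here is a minimal in-memory sketch in Python. It is not the paper's O(N/B)-I/O algorithm: it simply sorts the input and picks evenly spaced ranks, which only illustrates what "evenly scattered" pivots look like. The function name `pick_pivots` and the rank formula are assumptions made for this illustration.

```python
def pick_pivots(elements, m):
    """Return m pivots that split the sorted input into m + 1 near-equal gaps.

    Illustration only: this sorts everything in memory, whereas the paper's
    algorithm is deterministic and I/O-efficient. The rank formula below is
    one reasonable choice, not taken from the paper.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    s = sorted(elements)
    n = len(s)
    if m >= n:
        # Fewer elements than requested pivots: every element is a pivot.
        return s
    # The i-th pivot (1-based) sits at 0-based rank floor(i * (n + 1) / (m + 1)) - 1.
    return [s[(i * (n + 1)) // (m + 1) - 1] for i in range(1, m + 1)]


if __name__ == "__main__":
    # 11 elements, 2 pivots: three elements fall in each of the three gaps.
    print(pick_pivots(range(1, 12), 2))
```

With the elements 1 through 11 and m = 2, the pivots are 4 and 8, leaving three elements below, between, and above them.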

Similar resources

A Study on External Memory Scan-Based Skyline Algorithms

Skyline queries return the set of non-dominated tuples, where a tuple is dominated if there exists another with better values on all attributes. In the past few years the problem has been studied extensively, and a great number of external memory algorithms have been proposed. We thoroughly study the most important scan-based methods, which perform a number of passes over the database in order ...

Skyline Computation with Noisy Comparisons

Given a set of n points in a d-dimensional space, we seek to compute the skyline, i.e., those points that are not strictly dominated by any other point, using few comparisons between elements. We study the crowdsourcing-inspired setting ([FRPU94]) where comparisons fail with constant probability. In this model, Groz & Milo [GM15] show three bounds on the query complexity for the skyline problem...

Faster output-sensitive skyline computation algorithm

We present the second output-sensitive skyline computation algorithm, which is faster in the worst case than the only existing output-sensitive skyline computation algorithm [1] because our algorithm does not rely on the existence of a linear-time procedure for finding medians.

Dissertation Defense Efficient and Adaptive Skyline Computation

Abstract: Skyline, also known as maxima in computational geometry or Pareto in the business management field, is important for many applications involving multi-criteria decision making. The skyline of a set of multi-dimensional data points consists of the points for which no other point exists that is better in at least one dimension and at least as good in every other dimension. Although skyline ...
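The dominance definition above can be sketched directly in Python. This is a naive quadratic-time illustration of the definition only, not any of the algorithms listed on this page; the names `dominates` and `naive_skyline` are hypothetical, and "better" is assumed to mean "larger" in every dimension.

```python
def dominates(p, q):
    """True if p is at least as good as q in every dimension and strictly
    better in at least one (assumption: larger values are better)."""
    at_least_as_good = all(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return at_least_as_good and strictly_better


def naive_skyline(points):
    """Return the points that no other point dominates, in input order.

    Compares every pair of points, so it takes quadratic time; it is meant
    to mirror the definition, not to be efficient.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    pts = [(1, 5), (3, 3), (5, 1), (2, 2)]
    # (2, 2) is dominated by (3, 3); the other three points are incomparable.
    print(naive_skyline(pts))
```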

External Memory Algorithms for String Problems

In this paper we present external memory algorithms for some string problems. External memory algorithms have been developed in many research areas, as the speed gap between fast internal memory and slow external memory continues to grow. The goal of external memory algorithms is to minimize the number of input/output operations between internal memory and external memory. These years the sizes...

Publication date: 2013